Word Semantic Similarity Calculation Based on Domain Knowledge and HowNet
Abstract
Word semantic similarity is the foundation of semantic processing and a key issue in many applications. This paper argues that word semantic similarity should take domain knowledge into account, which traditional methods do not. To incorporate domain knowledge into semantic similarity measurement, the paper proposes a sensitive word set approach. For this purpose, we also propose a new approach to sememe similarity calculation that distinguishes three different positional relationships between two sememes. The results show that our method outperforms other methods based on the Chinese knowledge base ‘HowNet’.
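For orientation, the following is a minimal sketch of a common distance-based baseline for sememe similarity, in which similarity decays with the path length between two sememes in a hierarchy. It is not the method proposed in this paper: it does not distinguish positional relationships and uses no domain knowledge. The toy hierarchy, the names `PARENT`, `path_distance` and `sememe_similarity`, and the value of `alpha` are all assumptions made for illustration.

```python
# Toy sememe hierarchy (hypothetical, not taken from HowNet): child -> parent.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "artifact": "thing",
}

ALPHA = 1.6  # tunable smoothing parameter; the value is an assumption


def ancestors(sememe):
    """Return the path from a sememe up to the root, starting with the sememe itself."""
    path = []
    while sememe is not None:
        path.append(sememe)
        sememe = PARENT[sememe]
    return path


def path_distance(a, b):
    """Count the edges between a and b through their lowest common ancestor."""
    path_a = ancestors(a)
    steps_from_b = {node: i for i, node in enumerate(ancestors(b))}
    for steps_from_a, node in enumerate(path_a):
        if node in steps_from_b:
            return steps_from_a + steps_from_b[node]
    raise ValueError("sememes do not share a common ancestor")


def sememe_similarity(a, b, alpha=ALPHA):
    """Distance-based similarity: alpha / (distance + alpha), in (0, 1]."""
    return alpha / (path_distance(a, b) + alpha)
```

With this toy tree, `sememe_similarity("human", "animal")` is higher than `sememe_similarity("human", "artifact")` because the first pair is two edges apart and the second is three.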
Similar resources
Semantic Similarity Calculation of Chinese Word
This paper puts forward a two-layer computing method to calculate the semantic similarity of Chinese words. First, a Latent Dirichlet Allocation (LDA) topic model is used to generate the topic space. Then words are mapped into the topic space, forming topic distributions that are used to calculate the semantic similarity of words (the first layer of computing). Finally, using semantic dictionary "HowNet" to...
The Research of Chinese Words Semantic Similarity Calculation with Multi-Information
Text similarity has a relatively wide range of applications in many fields, such as intelligent information retrieval, question answering systems, text rechecking, and machine translation. Meaning-based text similarity computing has been used more widely in computing the similarity of words and phrases. Using the knowledge structure of the and its method of knowledg...
Chinese HowNet-Based Multi-factor Word Similarity Algorithm Integrated of Result Modification
In this paper, we first describe a novel approach to calculating Chinese sememe similarity based on the HowNet hierarchical sememe tree. When calculating sememe similarity, we not only take Semantic Distance, Node Depth and Semantic Coincidence Degree into consideration, but also propose two impact factors named Node Environment Dense (NED) and Node Layer Ratio (NLR) to optimize the ca...
E-HowNet: the Expansion of HowNet
HowNet is an on-line common-sense knowledge base unveiling inter-conceptual relations and inter-attribute relations of concepts as connoting in lexicons of the Chinese and their English equivalents [1]. Each concept is represented and understood by its definition and association links to other concepts. Compared with WordNet, HowNet’s architecture provides richer information apart from hypo...
A Maximum Entropy Approach To HowNet-Based Chinese Word Sense Disambiguation
This paper presents a maximum entropy method for the disambiguation of word senses as defined in HowNet. With the release of this bilingual (Chinese and English) knowledge base in 1999, a corpus of 30,000 words was sense tagged and released in January 2002. Concept meanings in HowNet are constructed from a closed set of sememes, the smallest meaning units, which can be treated as semantic tags. ...